Probabilistic Polynomials and Hamming Nearest Neighbors
We show how to compute any symmetric Boolean function on n variables over
any field (as well as the integers) with a probabilistic polynomial of degree
O(√(n log(1/ε))) and error at most ε. The degree
dependence on n and ε is optimal, matching a lower bound of Razborov
(1987) and Smolensky (1987) for the MAJORITY function. The proof is
constructive: a low-degree polynomial can be efficiently sampled from the
distribution.
This polynomial construction is combined with other algebraic ideas to give
the first subquadratic time algorithm for computing a (worst-case) batch of
Hamming distances in superlogarithmic dimensions, exactly. To illustrate, let
c ≥ 1 and d = c log n. Suppose we are given a database
of n vectors in {0,1}^d and a collection of n query vectors
in the same dimension. For all queries u, we wish to compute a database vector v
with minimum Hamming distance from u. We solve this problem in
n^{2 − 1/O(c log^2 c)} randomized time. Hence, the problem is in "truly subquadratic"
time for d = O(log n) dimensions, and in subquadratic time for
d = o((log^2 n)/(log log n)^2). We apply the algorithm to computing pairs with maximum
inner product, closest pair in ℓ_1 for vectors with bounded integer
entries, and pairs with maximum Jaccard coefficients.

Comment: 16 pages. To appear in 56th Annual IEEE Symposium on Foundations of
Computer Science (FOCS 2015).
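To make the batch Hamming nearest neighbor problem concrete, here is the straightforward quadratic-time baseline that the paper's subquadratic algorithm improves on (a minimal sketch of the problem statement only, not of the paper's polynomial-based algorithm; function and variable names are illustrative):

```python
def batch_hamming_nn(database, queries):
    """For each query vector, return the index of a database vector with
    minimum Hamming distance. Vectors are 0/1 tuples of equal length d.
    Runs in O(n^2 * d) time -- the quadratic baseline the paper beats."""
    answers = []
    for u in queries:
        best = min(range(len(database)),
                   key=lambda i: sum(a != b for a, b in zip(database[i], u)))
        answers.append(best)
    return answers

db = [(0, 1, 1, 0), (1, 1, 0, 0), (0, 0, 0, 1)]
qs = [(0, 1, 0, 0), (1, 1, 1, 1)]
print(batch_hamming_nn(db, qs))
```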
OV Graphs Are (Probably) Hard Instances
© Josh Alman and Virginia Vassilevska Williams. A graph G on n nodes is an Orthogonal Vectors (OV) graph of dimension d if there are vectors v_1, . . ., v_n ∈ {0, 1}^d such that nodes i and j are adjacent in G if and only if ⟨v_i, v_j⟩ = 0 over Z. In this paper, we study a number of basic graph algorithm problems, except that one is given as input the vectors defining an OV graph instead of a general graph. We show that for each of the following problems, an algorithm solving it faster on such OV graphs G of dimension only d = O(log n) than in the general case would refute a plausible conjecture about the time required to solve sparse MAX-k-SAT instances:
- Determining whether G contains a triangle. More generally, determining whether G contains a directed k-cycle for any k ≥ 3.
- Computing the square of the adjacency matrix of G over Z or F_2.
- Maintaining the shortest distance between two fixed nodes of G, or whether G has a perfect matching, when G is a dynamically updating OV graph.
We also prove some complementary results about OV graphs. We show that any problem which is NP-hard on constant-degree graphs is also NP-hard on OV graphs of dimension O(log n), and we give two problems which can be solved faster on OV graphs than in general: Maximum Clique, and Online Matrix-Vector Multiplication.
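The adjacency rule defining an OV graph is simple to state in code (a minimal sketch of the definition itself, not of the paper's reductions; names are illustrative):

```python
def ov_graph(vectors):
    """Build the adjacency matrix of the OV graph on the given 0/1 vectors:
    nodes i and j are adjacent iff <v_i, v_j> = 0 over the integers."""
    n = len(vectors)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and sum(a * b for a, b in zip(vectors[i], vectors[j])) == 0:
                adj[i][j] = 1
    return adj

vs = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0)]
adj = ov_graph(vs)
```

Note that the graph can be dense (up to Θ(n^2) edges) even though its description via vectors of dimension O(log n) has only O(n log n) bits, which is what makes fast algorithms on this representation plausible in the first place.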
Limits on the Universal Method for Matrix Multiplication
In this work, we prove limitations on the known methods for designing matrix multiplication algorithms. Alman and Vassilevska Williams [Alman and Williams, 2018] recently defined the Universal Method, which substantially generalizes all the known approaches including Strassen's Laser Method [V. Strassen, 1987] and Cohn and Umans' Group Theoretic Method [Cohn and Umans, 2003]. We prove concrete lower bounds on the algorithms one can design by applying the Universal Method to many different tensors. Our proofs use new tools for upper bounding the asymptotic slice rank of a wide range of tensors. Our main result is that the Universal Method applied to any Coppersmith-Winograd tensor CW_q cannot yield a bound on omega, the exponent of matrix multiplication, better than 2.16805. By comparison, it was previously only known that the weaker "Galactic Method" applied to CW_q could not achieve an exponent of 2.
We also study the Laser Method (which is, in principle, a highly special case of the Universal Method) and prove that it is "complete" for matrix multiplication algorithms: when it applies to a tensor T, it achieves omega = 2 if and only if it is possible for the Universal Method applied to T to achieve omega = 2. Hence, the Laser Method, which was originally used as an algorithmic tool, can also be seen as a lower bounding tool. For example, in their landmark paper, Coppersmith and Winograd [Coppersmith and Winograd, 1990] achieved a bound of omega <= 2.376, by applying the Laser Method to CW_q. By our result, the fact that they did not achieve omega = 2 implies a lower bound on the Universal Method applied to CW_q. Indeed, if it were possible for the Universal Method applied to CW_q to achieve omega = 2, then Coppersmith and Winograd's application of the Laser Method would have achieved omega = 2.
Faster Walsh-Hadamard Transform and Matrix Multiplication over Finite Fields using Lookup Tables
We use lookup tables to design faster algorithms for important algebraic
problems over finite fields. These faster algorithms, which only use arithmetic
operations and lookup table operations, may help to explain the difficulty of
determining the complexities of these important problems. Our results over a
constant-sized finite field are as follows.
The Walsh-Hadamard transform of a vector of length N can be computed using
O(N log N / log log N) bit operations. This generalizes to any transform
defined as a Kronecker power of a fixed matrix. By comparison, the fast
Walsh-Hadamard transform (similar to the fast Fourier transform) uses
O(N log N) arithmetic operations, which is believed to be optimal up to constant
factors.
Any algebraic algorithm for multiplying two N × N matrices using O(N^ω)
operations can be converted into an algorithm using
O(N^ω / (log N)^{ω/2 − 1}) bit operations. For example, Strassen's algorithm can
be converted into an algorithm using O(N^{2.81} / (log N)^{0.4}) bit
operations. It remains an open problem with practical implications to determine
the smallest constant c such that Strassen's algorithm can be implemented to
use c · N^{2.81} + o(N^{2.81}) arithmetic operations; using a lookup
table allows one to save a super-constant factor in bit operations.

Comment: 10 pages, to appear in the 6th Symposium on Simplicity in Algorithms
(SOSA 2023).
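For reference, the classical fast Walsh-Hadamard transform mentioned above, which uses O(N log N) arithmetic operations, can be sketched as follows (an illustrative in-place version over the integers; the paper's lookup-table algorithm, which saves a log log N factor in bit operations, is not shown):

```python
def fwht(a):
    """In-place fast Walsh-Hadamard transform of a list whose length is a
    power of 2. Each of the log N rounds does N/2 butterfly steps, for
    O(N log N) additions and subtractions in total."""
    n = len(a)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                x, y = a[i], a[i + h]
                a[i], a[i + h] = x + y, x - y
        h *= 2
    return a
```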
How to Capture Higher-order Correlations? Generalizing Matrix Softmax Attention to Kronecker Computation
In the classical transformer attention scheme, we are given three n × d size
matrices Q, K, V (the query, key, and value tokens), and the goal is
to compute a new n × d size matrix D^{-1} exp(QK^⊤)V, where
D = diag(exp(QK^⊤) 1_n). In this work, we study a
generalization of attention which captures triple-wise correlations. This
generalization is able to solve problems about detecting triple-wise
connections that were shown to be impossible for transformers. The potential
downside of this generalization is that it appears as though computations are
even more difficult, since the straightforward algorithm requires cubic time in
n. However, we show that in the bounded-entry setting (which arises in
practice, and which is well-studied in both theory and practice), there is
actually a near-linear time algorithm. More precisely, we show that bounded
entries are both necessary and sufficient for quickly performing generalized
computations:
- On the positive side, if all entries of the input matrices are
bounded above by o((log n)^{1/3}), then we show how to approximate the
"tensor-type" attention matrix in n^{1+o(1)} time.
- On the negative side, we show that if the entries of the input
matrices may be as large as Ω((log n)^{1/3}), then there is no
algorithm that runs faster than n^{3−o(1)} (assuming the Strong Exponential
Time Hypothesis from fine-grained complexity theory).
We also show that our construction, algorithms, and lower bounds naturally
generalize to higher-order tensors and correlations. Interestingly, the higher
the order of the tensors, the lower the bound on the entries needs to be for an
efficient algorithm. Our results thus yield a natural tradeoff between the
boundedness of the entries and the order of the tensor one may use for more
expressive, efficient attention computation.
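As a point of reference, the classical matrix softmax attention that this work generalizes can be computed naively as follows (a minimal pure-Python sketch; the paper's "tensor-type" attention scores triples of tokens rather than pairs, which is what drives the cubic running time discussed above):

```python
import math

def attention(Q, K, V):
    """Standard softmax attention on n x d lists-of-lists: row i of the
    output is the softmax of row i of Q K^T, applied as weights over the
    rows of V. The per-row normalizer z plays the role of the diagonal
    matrix D in the formula D^{-1} exp(QK^T) V."""
    n, d = len(Q), len(Q[0])
    out = []
    for i in range(n):
        scores = [sum(Q[i][t] * K[j][t] for t in range(d)) for j in range(n)]
        m = max(scores)                       # subtract max for stability
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        out.append([sum(weights[j] * V[j][t] for j in range(n)) / z
                    for t in range(d)])
    return out
```

Since the softmax weights in each row sum to 1, every output row is a convex combination of the rows of V.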
Further Limitations of the Known Approaches for Matrix Multiplication
We consider the techniques behind the current best algorithms for matrix
multiplication. Our results are threefold.
(1) We provide a unifying framework, showing that all known matrix
multiplication running times since 1986 can be achieved from a single very
natural tensor: the structural tensor T_q of addition modulo an integer q.
(2) We show that if one applies a generalization of the known techniques
(arbitrary zeroing out of tensor powers to obtain independent matrix products
in order to use the asymptotic sum inequality of Schönhage) to an arbitrary
monomial degeneration of T_q, then there is an explicit lower bound,
depending on q, on the bound on the matrix multiplication exponent ω
that one can achieve. We also show upper bounds on the value α that one
can achieve, where α is such that n × n^α × n matrix
multiplication can be computed in n^{2+o(1)} time.
(3) We show that our lower bound on ω approaches 2 as q goes to
infinity. This suggests a promising approach to improving the bound on
ω: for variable q, find a monomial degeneration of T_q which, using
the known techniques, produces an upper bound on ω as a function of q.
Then, take q to infinity. It is not ruled out, and hence possible, that one
can obtain ω = 2 in this way.

Comment: 16 pages. To appear in 9th Innovations in Theoretical Computer
Science Conference (ITCS 2018).
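The tensor at the center of this framework has a short description (a minimal sketch assuming one common convention for the structural tensor of addition modulo q, namely a 1 entry exactly where the third index is the sum of the first two mod q; the function name is illustrative):

```python
def addition_tensor(q):
    """The structural tensor of addition modulo q, in one common convention:
    T[i][j][k] = 1 iff i + j = k (mod q), and 0 otherwise."""
    return [[[1 if (i + j) % q == k else 0 for k in range(q)]
             for j in range(q)]
            for i in range(q)]

T = addition_tensor(3)
```

Each of the q^2 choices of (i, j) contributes exactly one nonzero entry, so the tensor has q^2 ones in total.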
Matrix Multiplication and Number on the Forehead Communication
Three-player Number On the Forehead communication may be thought of as a three-player Number In the Hand promise model, in which each player is given the inputs that are supposedly on the other two players' heads, and promised that they are consistent with the inputs of the other players. The set of all allowed inputs under this promise may be thought of as an order-3 tensor. We surprisingly observe that this tensor is exactly the matrix multiplication tensor, which is widely studied in the design of fast matrix multiplication algorithms.
Using this connection, we prove a number of results about both Number On the Forehead communication and matrix multiplication, each by using known results or techniques about the other. For example, we show how the Laser method, a key technique used to design the best matrix multiplication algorithms, can also be used to design communication protocols for a variety of problems. We also show how known lower bounds for Number On the Forehead communication can be used to bound properties of the matrix multiplication tensor such as its zeroing out subrank. Finally, we substantially generalize known methods based on slice-rank for studying communication, and show how they directly relate to the matrix multiplication exponent ω.
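The order-3 tensor in question, the matrix multiplication tensor ⟨n, n, n⟩, can be written down directly (an illustrative sketch using the standard definition: an entry is 1 exactly on the triples x_{ij}, y_{jk}, z_{ki} that multiply together in trace(XYZ); names are illustrative):

```python
def matmul_tensor(n):
    """The n x n matrix multiplication tensor <n,n,n>: index each of the
    three axes by pairs (a, b) in row-major order; the entry on axes
    ((i,j), (j,k), (k,i)) is 1, and all other entries are 0."""
    pairs = [(a, b) for a in range(n) for b in range(n)]
    idx = {p: t for t, p in enumerate(pairs)}
    m = n * n
    T = [[[0] * m for _ in range(m)] for _ in range(m)]
    for i in range(n):
        for j in range(n):
            for k in range(n):
                T[idx[(i, j)]][idx[(j, k)]][idx[(k, i)]] = 1
    return T
```

The tensor has exactly n^3 nonzero entries, one per scalar multiplication in the naive matrix multiplication algorithm.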
Dynamic Parameterized Problems and Algorithms
Fixed-parameter algorithms and kernelization are two powerful methods to solve NP-hard problems. Yet, so far those algorithms have been largely restricted to static inputs. In this paper we provide fixed-parameter algorithms and kernelizations for fundamental NP-hard problems with dynamic inputs. We consider a variety of parameterized graph and hitting set problems which are known to have f(k)n^{1+o(1)} time algorithms on inputs of size n, and we consider the question of whether there is a data structure that supports small updates (such as edge/vertex/set/element insertions and deletions) with an update time of g(k)n^{o(1)}; such an update time would be essentially optimal. Update and query times independent of n are particularly desirable. Among many other results, we show that Feedback Vertex Set and k-Path admit dynamic algorithms with f(k) log^{O(1)} n update and query times for some function f depending on the solution size k only.
We complement our positive results by several conditional and unconditional lower bounds. For example, we show that unlike their undirected counterparts, Directed Feedback Vertex Set and Directed k-Path do not admit dynamic algorithms with n^{o(1)} update and query times even for constant solution sizes k <= 3, assuming popular hardness hypotheses. We also show that unconditionally, in the cell probe model, Directed Feedback Vertex Set cannot be solved with update time that is purely a function of k.
Tensor Ranks and the Fine-Grained Complexity of Dynamic Programming
Generalizing work of Künnemann, Paturi, and Schneider [ICALP 2017], we
study a wide class of high-dimensional dynamic programming (DP) problems in
which one must find the shortest path between two points in a high-dimensional
grid given a tensor of transition costs between nodes in the grid. This
captures many classical problems which are solved using DP such as the knapsack
problem, the airplane refueling problem, and the minimal-weight polygon
triangulation problem. We observe that for many of these problems, the tensor
naturally has low tensor rank or low slice rank.
We then give new algorithms and a web of fine-grained reductions to tightly
determine the complexity of these problems. For instance, we show that a
polynomial speedup over the DP algorithm is possible when the tensor rank is a
constant or the slice rank is 1, but that such a speedup is impossible if the
tensor rank is slightly super-constant (assuming SETH) or the slice rank is at
least 3 (assuming the APSP conjecture). We find that this characterizes the
known complexities for many of these problems, and in some cases leads to new
faster algorithms.
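As a simple instance of the kind of DP recurrence studied here, the least-weight subsequence problem from Künnemann, Paturi, and Schneider's framework computes a cheapest path through a line of nodes given a matrix of transition costs (a minimal quadratic-time sketch of the one-dimensional case; the high-dimensional problems in this paper replace the cost matrix with a tensor, and it is low tensor rank or low slice rank of that cost tensor which enables, or provably rules out, polynomial speedups):

```python
def least_weight_path(w):
    """Given an (m+1) x (m+1) matrix w of transition costs, compute the
    cheapest way to travel from node 0 to node m by forward jumps i -> j
    (i < j), via the quadratic-time DP  T[j] = min_{i < j} T[i] + w[i][j]."""
    m = len(w) - 1
    T = [0.0] + [float("inf")] * m
    for j in range(1, m + 1):
        T[j] = min(T[i] + w[i][j] for i in range(j))
    return T[m]
```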